Abstract

This  document  describes  the  Impulse  system  calls.  The  Impulse  system  calls  allow  user  applications  to 
use  remapping  functionality  provided  by  the  Impulse  Adaptive  Memory  System  to  reorganize  their  data 
structures  without  actually  moving  data  around  the  physical  memory.  Impulse  supports  several  remapping 
algorithms.  User  applications  choose  the  desired  remapping  algorithms  by  calling  the  right  Impulse  system 
calls.  This  note  uses  detailed  examples  to  illustrate  each  Impulse  system  call. 

Title: Reference Manual of Impulse System Calls
Report date: 20 Jan 1999
Performing organization: Defense Advanced Research Projects Agency (DARPA), 3701 North Fairfax Drive, Arlington, VA 22203-1714
Distribution/availability: Approved for public release; distribution unlimited
Supplementary notes: The original document contains color images.
Security classification (report, abstract, this page): unclassified
Number of pages: 23


Contents


1  Introduction

2  Impulse System Calls

2.1  ams_mapshadow()

2.1.1  ams_mapshadow(AMS_TYPE_SUPERPAGE, superpage_args_t *datablock)

2.1.2  ams_mapshadow(AMS_TYPE_BASESTRIDE, basestride_args_t *datablock)

2.1.3  ams_mapshadow(AMS_TYPE_TRANSPOSE, transpose_args_t *datablock)

2.1.4  ams_mapshadow(AMS_TYPE_PAGECOLOR, pagecolor_args_t *datablock)

2.1.5  ams_mapshadow(AMS_TYPE_VINDIRECT, vindirect_args_t *datablock)

2.1.6  ams_mapshadow(AMS_TYPE_VINDIRECT2, vindirect2_args_t *datablock)

2.1.7  ams_mapshadow(AMS_TYPE_VINDIRECT3, vindirect3_args_t *datablock)

2.2  ams_remapshadow()

2.3  ams_allocvirt()

2.4  ams_mapvtov()


1  Introduction 


This document describes the Impulse system calls. The Impulse system calls and their related data structures
are defined in the following header file.


/nfs/flux/impsrc/dist/simpulse/app-lib/include/amssup.h


User  applications  should  include  this  header  file  and  link  with  the  following  object  file  to  pick  up  the  Impulse 
system  calls. 


/nfs/flux/impsrc/dist/simpulse/app-lib/libkernel.o


Currently, the Impulse memory system supports the following remappings: no-copy superpage formation,
strided remapping, transpose remapping, no-copy page coloring, and scatter/gather through an
indirection vector. Depending on what kind of values are stored in the indirection vector and how the
indirection vector is created, scatter/gather through an indirection vector is further split into three sub-types:
scatter/gather through an index vector, if the indirection vector stores indices to an array; scatter/gather
through an offset vector, if the indirection vector stores byte offsets; and dynamic cacheline assembly, if
the indirection vector is dynamically created by the OS.


2  Impulse  System  Calls 


This section describes four Impulse system calls:


•  ams_mapshadow() sets up remappings and is the primary interface for applications to access
Impulse features;

•  ams_remapshadow() allows users to adjust existing remappings previously set up by ams_mapshadow();

•  ams_allocvirt() allocates virtual memory, used in conjunction with ams_mapvtov();

•  ams_mapvtov() maps a set of virtual addresses to a specified shadow region, used in conjunction
with ams_allocvirt() to optimize the layout of shadow data in virtually indexed caches.

2.1  ams_mapshadow() 


int
ams_mapshadow(int type,
              void *datablock);

ams_mapshadow() is the main Impulse system call. Application programs use this system call to set up
remappings. The argument type specifies which type of remapping to set up. The argument datablock
is the address of a type-specific data structure. The type can be one of the following:


•  AMS_TYPE_SUPERPAGE:  no-copy  superpage  formation; 

•  AMS_TYPE_BASESTRIDE: strided scatter/gather remapping;

•  AMS_TYPE_TRANSPOSE: transpose remapping;

•  AMS_TYPE_PAGECOLOR: no-copy page coloring;

•  AMS_TYPE_VINDIRECT:  scatter/gather  through  an  index  vector; 

•  AMS_TYPE_VINDIRECT2:  scatter/gather  through  an  offset  vector; 

•  AMS_TYPE_VINDIRECT3:  dynamic  cacheline  assembly. 


2.1.1  ams_mapshadow(AMS_TYPE_SUPERPAGE, superpage_args_t *datablock)

ams_mapshadow(AMS_TYPE_SUPERPAGE, datablock) sets up a remapping of no-copy superpage
formation. The argument datablock points to a superpage_args_t structure defined as the following:


typedef struct {
    vaddr_t  vaddr;
    int      size;
    int      prefcount;
    int      prefinfo;
} superpage_args_t;

This  call  maps  the  virtual  memory  region,  starting  at  address  vaddr  with  size  bytes  in  length,  to  an 
equally-sized  contiguous  shadow  memory  region.  The  Impulse  MMC  is  responsible  for  translating  the 
shadow  memory  region  back  to  the  physical  pages  to  which  the  original  virtual  memory  region  maps.  Both 
vaddr  and  size  must  be  page-aligned. 

prefcount and prefinfo contain information about how to perform prefetching inside the shadow
region: prefcount is the number of blocks to prefetch each time, and prefinfo is the prefetch distance.
For example, assuming that prefcount equals 2 and prefinfo equals -3 (a negative value means
prefetching backwards), when the memory controller receives a load request for block A, it will prefetch
two blocks: block (A - 3) and block (A - 6).
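
The prefetch rule in the example above can be written out as a small model. This is only a sketch of the arithmetic as described in the text, assuming that the i-th prefetched block lies i prefetch distances away from the requested block; the function name prefetch_targets is hypothetical and is not part of the Impulse interface.

```c
#include <assert.h>

/*
 * Illustrative model only (hypothetical helper, not an Impulse call):
 * fill out[] with the block numbers that would be prefetched after a
 * load request for `block`. A negative prefinfo prefetches backwards.
 */
static void prefetch_targets(long block, int prefcount, int prefinfo,
                             long *out)
{
    assert(prefcount >= 0);
    for (int i = 0; i < prefcount; i++)
        out[i] = block + (long)(i + 1) * prefinfo;
}
```

With prefcount equal to 2 and prefinfo equal to -3, a request for block A yields A - 3 and A - 6, matching the example in the text.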

Since the remapped region is contiguous in both virtual memory and shadow memory after remapping, it
can use superpages to reduce the number of TLB entries required to map it. This call converts the virtual
memory region to TLB superpages, with the largest possible superpages allocated, based on the alignment
of vaddr. Because superpages must be aligned to their sizes, superpages are allocated by walking the
virtual address region and assigning the largest possible superpage restricted by the current alignment. For
example, if the current address is 16Kbyte aligned, a 16Kbyte superpage can be assigned. The address is
then incremented, and the new address alignment is checked. This procedure proceeds until the end of the
region is reached.

Superpage sizes are power-of-2 multiples of the base page size, ranging from 8K to 4M bytes. The base
page size is 4K bytes. In practice, the first few and last few pages are often base pages, with the pages in
the middle being larger. For example, a virtual memory region at 0x00039000 with 0x100000 bytes will be
converted to 9 pages of 4K, 8K, 16K, 256K, 512K, 128K, 64K, 32K, and 4K bytes respectively. This
conversion reduces the number of TLB entries required for this region from 256 to 9.
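
The walk described above can be sketched in a few lines of C. This is a user-level model, not the kernel code: the names superpage_walk, BASE_PAGE, and MAX_SUPERPAGE are hypothetical, and the rule that a superpage must also fit in the bytes remaining is an assumption inferred from the worked example.

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PAGE      0x1000UL    /* 4K base page, as stated in the text */
#define MAX_SUPERPAGE  0x400000UL  /* 4M, the largest superpage size      */

/*
 * Illustrative model of the superpage walk. Writes the chosen page sizes
 * into sizes[] and returns how many pages were chosen. vaddr and size
 * must be multiples of BASE_PAGE.
 */
static size_t superpage_walk(unsigned long vaddr, unsigned long size,
                             unsigned long *sizes, size_t max_entries)
{
    unsigned long addr = vaddr, end = vaddr + size;
    size_t n = 0;

    assert(vaddr % BASE_PAGE == 0 && size % BASE_PAGE == 0);
    while (addr < end && n < max_entries) {
        unsigned long page = BASE_PAGE;
        /* Double the page while the address stays aligned to the larger
         * size and the larger page still fits inside the region. */
        while (page * 2 <= MAX_SUPERPAGE &&
               addr % (page * 2) == 0 &&
               addr + page * 2 <= end)
            page *= 2;
        sizes[n++] = page;
        addr += page;
    }
    return n;
}
```

Calling superpage_walk(0x39000, 0x100000, sizes, 16) in this model produces the nine page sizes listed in the example.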

ams_mapshadow() returns 0 on success and -1 otherwise.

ERRORS: 

EINVAL  Either  vaddr  or  size  is  not  page-aligned. 
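
Because both vaddr and size must be page-aligned, a caller may want to check or round its arguments before making the call. The helpers below are a minimal sketch under the 4K base page size stated above; the names IMPULSE_BASE_PAGE, is_page_aligned, and page_round_up are hypothetical and do not come from the Impulse header file.

```c
#include <assert.h>
#include <limits.h>

#define IMPULSE_BASE_PAGE 0x1000UL  /* hypothetical name; 4K base page */

/* Returns nonzero if x is a multiple of the base page size. */
static int is_page_aligned(unsigned long x)
{
    return (x & (IMPULSE_BASE_PAGE - 1)) == 0;
}

/* Rounds nbytes up to the next multiple of the base page size. */
static unsigned long page_round_up(unsigned long nbytes)
{
    assert(nbytes <= ULONG_MAX - (IMPULSE_BASE_PAGE - 1));
    return (nbytes + IMPULSE_BASE_PAGE - 1) & ~(IMPULSE_BASE_PAGE - 1);
}
```

A caller could apply page_round_up() to a byte count such as count * sizeof(double) before storing it in size, provided the rounded region really belongs to the data structure being remapped.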

Figure 1 shows a simple example of using ams_mapshadow(AMS_TYPE_SUPERPAGE, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.


2.1.2  ams_mapshadow(AMS_TYPE_BASESTRIDE,  basestride_args_t  *datablock) 

ams_mapshadow(AMS_TYPE_BASESTRIDE, datablock) sets up a strided scatter/gather remapping.
The argument datablock points to a basestride_args_t structure defined as the following:


typedef struct {
    vaddr_t  *newaddr;
    vaddr_t  vaddr;
    int      count;
    int      objsize;
    int      stride;
    int      offset;
    int      prefcount;
    int      prefinfo;
    int      salign;
    int      valign;
} basestride_args_t;


/*
 * Creates superpages for array A[count]
 */
#define PAGESIZE 0x1000

double foo(double *A, int count)
{
    int i;
    double sum = 0;
    superpage_args_t sp_args;

#ifdef IMPULSE
    sp_args.vaddr = (vaddr_t) A;
    sp_args.size = count * sizeof(double);
    sp_args.prefcount = 0;
    sp_args.prefinfo = 0;
    if (ams_mapshadow(AMS_TYPE_SUPERPAGE, &sp_args) < 0) {
        printf("ams_mapshadow(AMS_TYPE_SUPERPAGE, ...) failed\n");
        exit(1);
    }
#endif

    for (i = 0; i < count; i += PAGESIZE / sizeof(double))
        sum += A[i];

    return sum;
}

Figure 1: A code fragment using ams_mapshadow(AMS_TYPE_SUPERPAGE, ...).


[Figure 2 diagram: the original data structure and the compacted alias array.]

Figure 2: Visualizing strided remapping.

This  call  compacts  data  items  in  strided  memory  locations  into  a  dense  alias  array,  as  shown  by  Figure  2. 

vaddr is the starting address of the data structure being remapped and must be page-aligned. The data
structure has count strides, each of which is stride bytes in length. Each required data element is
objsize bytes in length, and its byte offset from the base of the stride is given by offset. prefcount
is the number of blocks to prefetch each time and prefinfo is the prefetch stride.
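
The meaning of count, stride, objsize, and offset can be illustrated with a software model of the gather. The sketch below copies the data, whereas Impulse builds the alias array without copying; gather_strided is a hypothetical helper used only to show which bytes end up in which alias element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Software model of strided remapping (hypothetical helper). Element i of
 * the alias array is the objsize bytes found at
 * vaddr + i * stride + offset in the original data structure.
 */
static void gather_strided(const void *vaddr, size_t count, size_t objsize,
                           size_t stride, size_t offset, void *alias)
{
    const unsigned char *src = vaddr;
    unsigned char *dst = alias;

    assert(offset + objsize <= stride);  /* element lies inside one stride */
    for (size_t i = 0; i < count; i++)
        memcpy(dst + i * objsize, src + i * stride + offset, objsize);
}
```

For a float array A, a stride of 8 * sizeof(float) and an offset of 2 * sizeof(float) select A[2], A[10], A[18], and so on.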

The OS first allocates a shadow region for the alias array storing compacted data items. The shadow region
has (count x objsize) bytes. The OS then allocates a new virtual region and maps it to the shadow
region. The starting address of the new virtual region is stored at a location in the application address
space pointed to by newaddr. salign specifies the expected alignment of the new shadow region and
determines where the alias array will be mapped to in physically indexed caches (like most L2 caches in
modern microarchitectures); and valign specifies the expected alignment of the new virtual region and
determines where the alias array will be mapped to in virtually indexed caches (like most L1 caches in modern
microarchitectures). Both salign and valign must be page-aligned.

ams_mapshadow() returns 0 on success, and -1 otherwise. If successful, the virtual address of the alias
array is placed into the memory location pointed to by newaddr.

ERRORS: 

EINVAL  Either  vaddr  or  salign  or  valign  is  not  page-aligned. 

EFAULT  newaddr  specifies  an  invalid  address. 
Figure 3 shows a simple example of using ams_mapshadow(AMS_TYPE_BASESTRIDE, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.


2.1.3  ams_mapshadow(AMS_TYPE_TRANSPOSE,  transpose_args_t  *datablock) 

ams_mapshadow(AMS_TYPE_TRANSPOSE, datablock) sets up a transpose remapping. The
argument datablock points to a transpose_args_t structure defined as the following:


typedef struct {
    vaddr_t  *newaddr;
    vaddr_t  vaddr;
    int      elemsize;
    int      rownum;
    int      rowsize;
    int      prefcount;
    int      prefinfo;
    int      salign;
    int      valign;
} transpose_args_t;


This  system  call  maps  a  two-dimensional  matrix  to  its  transpose  without  copying. 

newaddr is a pointer to a location in the application address space where the kernel can store the return
value. vaddr is the virtual address of a two-dimensional matrix and must be page-aligned. elemsize
gives the size of a matrix element in bytes. rownum gives the number of rows that the two-dimensional matrix
has. rowsize gives the size of each row in bytes. (Thus, the matrix has rowsize/elemsize columns.)
prefcount is the number of blocks to prefetch each time and prefinfo is the prefetch distance, as
described in Section 2.1.1.

The return value of this function is the virtual address of a new matrix, the transpose of the original matrix.
salign specifies the expected alignment of the new matrix's shadow region; and valign specifies the
expected alignment of the new matrix's virtual region. They determine where the new matrix will be mapped
to in the caches. Both of them must be page-aligned.

ams_mapshadow() returns 0 on success, and -1 otherwise. If successful, the virtual address of the new
matrix is placed into the memory location pointed to by newaddr.
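
The relationship between the new matrix and the original one can be stated as an address computation. The sketch below is a model of that relationship only, assuming the original matrix is stored row by row as described above; transpose_source_offset is a hypothetical name, not an Impulse function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative model (hypothetical helper): returns the byte offset, within
 * the original matrix, of the data seen at element (r, c) of the transposed
 * matrix. The original has rownum rows of rowsize bytes each, so the
 * transpose has rowsize/elemsize rows and rownum columns.
 */
static size_t transpose_source_offset(size_t r, size_t c, size_t elemsize,
                                      size_t rownum, size_t rowsize)
{
    size_t colnum = rowsize / elemsize;

    assert(r < colnum && c < rownum);
    /* new[r][c] aliases old[c][r] */
    return c * rowsize + r * elemsize;
}
```

For a matrix of doubles with 2 rows and 3 columns, element (2, 1) of the transpose aliases the original element (1, 2), which lies 40 bytes from the base of the original matrix.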

ERRORS:

EINVAL  Either vaddr or salign or valign is not page-aligned.

EFAULT  newaddr specifies an invalid address.


/*
 * Compute the sum of A[8*i+2], where i is from 0 to (size/8 - 1).
 */
float foo(float *A, int size)
{
    basestride_args_t bs_args;
    float *Anew;
    float sum = 0;
    int   step = 8;
    int   i;

#ifdef IMPULSE
    bs_args.newaddr   = (vaddr_t *) &Anew;
    bs_args.vaddr     = (vaddr_t) A;
    bs_args.count     = size / step;
    bs_args.objsize   = sizeof(float);
    bs_args.stride    = sizeof(float) * step;
    bs_args.offset    = sizeof(float) * 2;
    bs_args.prefcount = 1;       /* prefetch one block each time */
    bs_args.prefinfo  = 1;       /* prefetch forward */
    bs_args.salign    = 0x4000;  /* Anew[0] to offset 0x4000 in L2C */
    bs_args.valign    = 0x2000;  /* Anew[0] to offset 0x2000 in L1C */

    if (ams_mapshadow(AMS_TYPE_BASESTRIDE, &bs_args) == -1) {
        perror("ams_mapshadow(AMS_TYPE_BASESTRIDE, ...) failed.");
        exit(1);
    }
#endif

    for (i = 0; i < size / step; i++) {
#ifdef IMPULSE
        sum += Anew[i];
#else
        sum += A[i * 8 + 2];
#endif
    }

    return sum;
}

Figure 3: A code fragment using ams_mapshadow(AMS_TYPE_BASESTRIDE, ...).


Figure 4 shows a simple example of using ams_mapshadow(AMS_TYPE_TRANSPOSE, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.


2.1.4  ams_mapshadow(AMS_TYPE_PAGECOLOR, pagecolor_args_t *datablock)

ams_mapshadow(AMS_TYPE_PAGECOLOR, datablock) sets up a remapping of no-copy page coloring.
The argument datablock points to a pagecolor_args_t structure defined as the following:


typedef struct {
    vaddr_t  vaddr;
    int      size;
    int      waysize;
    int      colorfactor;
    int      colorid;
    int      prefcount;
    int      prefinfo;
} pagecolor_args_t;

This  call  sets  up  a  remapping  for  a  specified  virtual  region  so  that  the  whole  region  will  be  mapped  to  only 
a  designated  portion  of  a  physically  indexed  cache.  Figure  5  shows  how  to  use  page  coloring  to  map  a  data 
structure  to  the  third  quadrant  of  a  physically  indexed  L2  cache. 

vaddr points to the virtual region being remapped and must be page-aligned. size gives the size of the
virtual region in bytes. waysize gives the way size of the targeted physically indexed cache, which equals
the cache size divided by its associativity. colorfactor is the number of colors that the cache is split into.
colorid is the index of the color to which the virtual region will solely map. In Figure 5, colorfactor
is 4 and colorid is 2. prefcount is the number of blocks to prefetch each time and prefinfo is the
prefetch distance.
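
One way to picture the effect of waysize, colorfactor, and colorid is to compute where each byte of the region could land. The layout below is an assumption made for illustration, not a description of how Impulse assigns shadow addresses: it places consecutive color-sized chunks of the region at the same color slot in successive way-sized windows. The function name color_shadow_offset is hypothetical.

```c
#include <assert.h>

/*
 * Illustrative model (assumed layout, hypothetical helper): maps byte k of
 * the virtual region to a shadow offset whose position within a cache way
 * always falls inside the slot belonging to colorid.
 */
static unsigned long color_shadow_offset(unsigned long k,
                                         unsigned long waysize,
                                         unsigned long colorfactor,
                                         unsigned long colorid)
{
    assert(colorfactor > 0 && colorid < colorfactor);

    /* colorsize is waysize divided by colorfactor */
    unsigned long colorsize = waysize / colorfactor;

    return (k / colorsize) * waysize      /* which way-sized window   */
         + colorid * colorsize            /* start of the color slot  */
         + (k % colorsize);               /* position inside the slot */
}
```

With a 128Kbyte way, colorfactor 4, and colorid 2, every offset produced by this model falls in the third quadrant of the way, that is, between 64K and 96K modulo the way size.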

ams_mapshadow() returns 0 on success, and -1 otherwise.

ERRORS: 

EINVAL  vaddr  is  not  page-aligned. 

Figure 6 shows a simple example of using ams_mapshadow(AMS_TYPE_PAGECOLOR, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.
/*
 * Dense matrix-matrix multiplication: C = A * B.
 * A, B, and C are (size x size) matrices stored in row-major order.
 */
void foo(double *A, double *B, double *C, int size)
{
    transpose_args_t tr_args;
    double *Bnew, sum;
    int i, j, k;

#ifdef IMPULSE
    tr_args.newaddr   = (vaddr_t *) &Bnew;
    tr_args.vaddr     = (vaddr_t) B;
    tr_args.elemsize  = sizeof(double);
    tr_args.rownum    = size;
    tr_args.rowsize   = sizeof(double) * size;
    tr_args.prefcount = 1;
    tr_args.prefinfo  = 1;
    tr_args.salign    = 0;  /* don't care */
    tr_args.valign    = 0;  /* don't care */

    if (ams_mapshadow(AMS_TYPE_TRANSPOSE, &tr_args) == -1) {
        perror("ams_mapshadow(AMS_TYPE_TRANSPOSE, ...) failed.");
        exit(1);
    }
#endif

    for (i = 0; i < size; i++)
        for (j = 0; j < size; j++) {
            for (sum = 0, k = 0; k < size; k++)
#ifdef IMPULSE
                sum += A[i * size + k] * Bnew[j * size + k];
#else
                sum += A[i * size + k] * B[k * size + j];
#endif
            C[i * size + j] = sum;
        }
}

Figure 4: A code fragment using ams_mapshadow(AMS_TYPE_TRANSPOSE, ...).
Figure 5: Mapping a data structure to the third quadrant of a physically indexed L2 cache. In this example,
colorfactor is 4 and colorid is 2. Note that colorsize is waysize divided by colorfactor.

2.1.5  ams_mapshadow(AMS_TYPE_VINDIRECT, vindirect_args_t *datablock)

ams_mapshadow(AMS_TYPE_VINDIRECT, datablock) sets up a remapping of scatter/gather through
an index vector, a special case of scatter/gather through an indirection vector when the indirection vector
stores array indices. The argument datablock points to a vindirect_args_t structure defined as the
following:


typedef struct {
    vaddr_t  *newaddr;
    vaddr_t  vaddr;
    int      count;
    int      objsize;
    vaddr_t  iv_vaddr;
    int      iv_objcount;
    int      iv_objsize;
    int      isfortran;
    int      iv_subtype;
    int      maxcount;
    int      prefcount;
    int      prefinfo;
    int      salign;
    int      valign;
} vindirect_args_t;


This call sets up a region of shadow addresses mapped to a one-dimensional array through an indirection
vector. That is, a shadow address at offset soffset in a shadow region is mapped to data item vector[soffset]
in physical memory.
/*
 * Map A to the second half of L2 cache and B to the first half
 * to avoid A[i] and B[i] mapping to the same line of
 * a two-way set-associative, 256Kbytes L2 cache.
 */
#ifdef IMPULSE
void color_array(void *x, int size, int colorfactor, int colorid);
#endif

double foo(double *A, double *B, int size)
{
    int i;
    double sum = 0;

#ifdef IMPULSE
    color_array(A, size * sizeof(double), 2, 1);
    color_array(B, size * sizeof(double), 2, 0);
#endif

    for (i = 0; i < size; i++)
        sum += A[i] + B[i];

    return sum;
}

#ifdef IMPULSE
void color_array(void *x, int size, int colorfactor, int colorid)
{
    pagecolor_args_t args;

    args.vaddr       = (vaddr_t) x;
    args.size        = size;
    args.waysize     = 128 * 1024;  /* 256Kbytes/2-way */
    args.colorfactor = colorfactor;
    args.colorid     = colorid;
    args.prefcount   = 1;
    args.prefinfo    = 1;

    if (ams_mapshadow(AMS_TYPE_PAGECOLOR, &args) == -1) {
        printf("ams_mapshadow_pagecolor failed\n");
        exit(1);
    }
}
#endif

Figure 6: A code fragment using ams_mapshadow(AMS_TYPE_PAGECOLOR, ...).
newaddr is a pointer to a location in the application address space where the kernel can store the return
value. vaddr is the starting virtual address of the original one-dimensional array and must be page-aligned.
The array contains count elements and each element is objsize bytes in length. iv_vaddr is the
starting virtual address of the indirection vector. The indirection vector contains iv_count elements and
each element is iv_objsize bytes in length. isfortran indicates whether or not the indirection vector
stores Fortran-style array subscripts, i.e., subscripts starting from 1, not from 0 as in C/C++. iv_subtype
represents the subtype of this remapping. In the current implementation, iv_subtype should be 0 if iv_count
equals maxcount, or 1 if iv_count is less than maxcount. prefcount is the number of blocks to
prefetch each time and prefinfo is the prefetch distance.

The return value of this function is the virtual address of a new vector. The new vector has maxcount
elements. When maxcount is larger than iv_count, the indirection vector will be reused, in the sense
that the 0th and iv_count-th elements of the new vector both use the 0th element of the indirection vector.
salign specifies the expected alignment of the new vector's shadow region; and valign specifies the
expected alignment of the new vector's virtual region. They determine where the new vector will be mapped
to in the caches. Both of them must be page-aligned.

ams_mapshadow() returns 0 on success, and -1 otherwise. If successful, the virtual address of the new
vector is placed into the memory location pointed to by newaddr.
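
The way the new vector selects elements of the original array can be modeled in software. The sketch below assumes an indirection vector of int elements, assumes that reuse of the indirection vector wraps around modulo its length, and assumes that Fortran-style subscripts are converted by subtracting 1; vindirect_source_index is a hypothetical helper, not part of the Impulse interface.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Illustrative model (hypothetical helper): returns the index into the
 * original array that element i of the new vector refers to.
 */
static size_t vindirect_source_index(const int *iv, size_t iv_count,
                                     int isfortran, size_t i)
{
    assert(iv_count > 0);

    /* Elements 0 and iv_count of the new vector share iv[0]. */
    int subscript = iv[i % iv_count];

    /* Fortran-style subscripts start from 1, C-style ones from 0. */
    return (size_t)(subscript - (isfortran ? 1 : 0));
}
```

With an indirection vector {3, 1, 2} holding Fortran-style subscripts, elements 0 and 3 of the new vector both refer to element 2 of the original array in this model.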

ERRORS: 

EINVAL  Either  vaddr  or  iv_vaddr  or  salign  or  valign  is  not  page-aligned. 

EFAULT  newaddr  specifies  an  invalid  address. 

Figure 7 shows a simple example of using ams_mapshadow(AMS_TYPE_VINDIRECT, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.


2.1.6  ams_mapshadow(AMS_TYPE_VINDIRECT2, vindirect2_args_t *datablock)

ams_mapshadow(AMS_TYPE_VINDIRECT2, datablock) sets up a remapping of scatter/gather
through an offset vector, a special case of scatter/gather through an indirection vector when the indirection
vector stores byte offsets. The argument datablock points to a vindirect2_args_t structure defined
as the following:

typedef struct {
    vaddr_t  offset;
    vaddr_t  size;
} ATTRIB;

#define MAX_ATTR 8
/*
 * sum = A x P, where A is a sparse matrix and P is a dense vector.
 * The original code is extracted from CG of NPB2.3.
 */
void foo(double *A, double *P, int *colidx, int *rows,
         int acount, int pcount, int rowcount, double *sum)
{
    vindirect_args_t vi_args;
    double *pnew;
    int i, k;

#ifdef IMPULSE
    vi_args.newaddr    = (vaddr_t *) &pnew;
    vi_args.vaddr      = (vaddr_t) P;
    vi_args.count      = pcount;
    vi_args.objsize    = sizeof(double);
    vi_args.iv_vaddr   = (vaddr_t) colidx;
    vi_args.iv_count   = acount;
    vi_args.iv_objsize = sizeof(int);
    vi_args.isfortran  = 1;  /* CG is Fortran code */
    vi_args.iv_subtype = 0;  /* iv_count equals maxcount */
    vi_args.maxcount   = acount;
    vi_args.prefcount  = 1;
    vi_args.prefinfo   = 1;
    vi_args.salign     = 0;  /* don't care */
    vi_args.valign     = 0;  /* don't care */

    if (ams_mapshadow(AMS_TYPE_VINDIRECT, &vi_args) == -1) {
        perror("ams_mapshadow(AMS_TYPE_VINDIRECT, ...) failed.");
        exit(1);
    }
#endif

    for (i = 0; i < rowcount; i++) {
        for (k = rows[i]; k < rows[i+1]; k++)
#ifdef IMPULSE
            sum[k] = A[k] * pnew[k];
#else  /* Non-Impulse version */
            sum[k] = A[k] * P[colidx[k]];
#endif
    }
}

Figure 7: A code fragment using ams_mapshadow(AMS_TYPE_VINDIRECT, ...).
typedef struct {
    vaddr_t  *newaddr;
    vaddr_t  vaddr;
    int      count;
    int      size;
    vaddr_t  iv_vaddr;
    int      iv_objsize;
    int      prefcount;
    int      prefinfo;
    int      attribs_num;
    ATTRIB   attribs[MAX_ATTR];
} vindirect2_args_t;

This system call was specifically designed for the PostgreSQL database management program. The main data
structures of PostgreSQL are active pages. Each active page has the same format: a small header, followed by
an offset vector, followed by database records. The last attribute of a database record varies in length, which
makes the database records vary in size too. The offset vector stores each record's byte offset from the base
of the active page. This call maps the required attributes of database records in an active page into a dense
shadow region.

newaddr is a pointer to a location in the application address space where the kernel can store the return
value. vaddr is the virtual address of an active page and must be page-aligned. size is the number
of bytes in the active page. count is the number of database records in the active page. iv_vaddr and
iv_objsize are the virtual address and element size of the offset vector in the active page. attribs_num
is the number of attributes to be gathered in each database record, with eight as its maximum value. The
offset and size of each required attribute are stored in array attribs[]. prefcount is the number of
blocks to prefetch each time and prefinfo is the prefetch distance.

The  return  value  of  this  function  is  the  virtual  address  of  a  new  vector.  Each  element  of  this  vector  contains 
all  gathered  attributes  of  a  database  record. 
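
The gathering of attributes can be modeled in software as follows. This sketch copies bytes, whereas Impulse performs the gather through remapping; it also assumes an offset vector of int elements. The type attrib_model_t is a stand-in for ATTRIB and gather_attributes is a hypothetical helper, neither of which comes from the Impulse header file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for ATTRIB: byte offset and size of one attribute in a record. */
typedef struct {
    size_t offset;
    size_t size;
} attrib_model_t;

/*
 * Software model (hypothetical helper): for each of the count records in
 * the active page, copy the requested attributes back to back into out[].
 * Returns the size in bytes of one element of the new vector.
 */
static size_t gather_attributes(const unsigned char *page, const int *offsets,
                                size_t count, const attrib_model_t *attribs,
                                size_t attribs_num, unsigned char *out)
{
    size_t objsize = 0;

    assert(attribs_num <= 8);  /* at most eight attributes per record */
    for (size_t a = 0; a < attribs_num; a++)
        objsize += attribs[a].size;

    for (size_t i = 0; i < count; i++) {
        const unsigned char *rec = page + offsets[i];
        unsigned char *dst = out + i * objsize;
        for (size_t a = 0; a < attribs_num; a++) {
            memcpy(dst, rec + attribs[a].offset, attribs[a].size);
            dst += attribs[a].size;
        }
    }
    return objsize;
}
```

Each element of the output holds the gathered attributes of one record, in the order they appear in attribs[].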

ams_mapshadow() returns 0 on success, and -1 otherwise. If successful, the virtual address of the new
vector is placed into the memory location pointed to by newaddr.

ERRORS: 

EINVAL  vaddr  is  not  page-aligned. 

EFAULT  newaddr  specifies  an  invalid  address. 

Figure 8 shows a simple example of using ams_mapshadow(AMS_TYPE_VINDIRECT2, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version has IMPULSE
defined while the non-Impulse version does not.
typedef struct {
    int   a;
    float b;
    int   c;
    float d;
} RECORD;

typedef struct {
    int   a;
    float d;
} OBJECT;

/*
 * Sum up attributes "a" and "d" of RECORD.
 */
void foo(RECORD *A, int *offsets, int asize, int record_count,
         int *suma, float *sumd)
{
    vindirect2_args_t vi2_args;
    OBJECT *objects;
    RECORD *record;

#ifdef IMPULSE
    vi2_args.newaddr           = (vaddr_t *) &objects;
    vi2_args.vaddr             = (vaddr_t) A;
    vi2_args.count             = record_count;
    vi2_args.size              = asize;
    vi2_args.attribs_num       = 2;
    vi2_args.attribs[0].offset = 0;
    vi2_args.attribs[0].size   = sizeof(int);
    vi2_args.attribs[1].offset = sizeof(int) * 2 + sizeof(float);
    vi2_args.attribs[1].size   = sizeof(float);
    vi2_args.iv_vaddr          = (vaddr_t) offsets;
    vi2_args.iv_objsize        = sizeof(int);
    vi2_args.prefcount         = 1;
    vi2_args.prefinfo          = 1;

    if (ams_mapshadow(AMS_TYPE_VINDIRECT2, &vi2_args) == -1) {
        perror("ams_mapshadow(AMS_TYPE_VINDIRECT2, ...) failed.");
        exit(1);
    }

    for (int i = 0; i < record_count; i++) {
        *suma += objects[i].a;  *sumd += objects[i].d;
    }
#else
    for (int i = 0; i < record_count; i++) {
        record = (RECORD *) ((vaddr_t) A + (vaddr_t) offsets[i]);
        *suma += record->a;  *sumd += record->d;
    }
#endif
}

Figure 8: A code fragment using ams_mapshadow(AMS_TYPE_VINDIRECT2, ...).


2.1.7  ams_mapshadow(AMS_TYPE_VINDIRECT3, vindirect3_args_t *datablock)

ams_mapshadow(AMS_TYPE_VINDIRECT3, datablock) sets up dynamic cacheline assembly, a
variation of scatter/gather through an indirection vector when the indirection vector is dynamically
created by this system call. The argument datablock points to a vindirect3_args_t structure defined
as the following:


typedef struct {
    vaddr_t  *newaddr;
    vaddr_t  *iv_newaddr;
    vaddr_t  vaddr;
    vaddr_t  size;
    int      objsize;
    int      objcount;
    int      iv_objsize;
    int      iv_subtype;
    int      prefcount;
    int      prefinfo;
    int      salign;
    int      valign;
} vindirect3_args_t;


This  system  call  creates  an  indirection  vector  and  an  alias  array.  To  access  data  using  the  alias  array,  the 
application  program  must  first  fill  in  the  indirection  vector,  then  flush  it  back  to  the  memory. 

newaddr and iv_newaddr are pointers to locations in the application address space where the kernel
stores the addresses of the alias array and the indirection vector, respectively. vaddr points to the
start of the virtual region from which data will be gathered. size is the number of bytes in this
virtual region. objsize is the size of each data item being gathered. objcount is the number of
elements in the alias array (and in the indirection vector). iv_objsize is the size of each element in
the indirection vector. iv_subtype indicates the type of values stored in the indirection vector:
0 means array indices, 1 means byte offsets in virtual memory, 2 means virtual addresses, 3 means
shadow addresses, and 4 means real physical addresses [1]. prefcount is the number of blocks to
prefetch each time and prefinfo is the prefetch distance.
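
As a rough mental model of what the alias array presents to the application, the gather can be
emulated in software. The sketch below is not Impulse code: gather_model() and the two IV_*
constants are hypothetical names introduced here, and the function only illustrates how iv_subtype
0 (array indices) and 1 (byte offsets) would select objects of objsize bytes from the region
starting at vaddr. On a real Impulse system the gathering is done by the memory system, not by a
copy loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for two of the iv_subtype values described above. */
enum { IV_INDEX = 0, IV_BYTE_OFFSET = 1 };

/*
 * Software model of the gather: copy objcount objects of objsize bytes
 * from the region [base, base + size) into alias, selecting each object
 * through the indirection vector iv.
 * Returns 0 on success, -1 if an entry falls outside the region.
 */
int gather_model(void *alias, const void *base, size_t size,
                 size_t objsize, size_t objcount,
                 const int *iv, int iv_subtype)
{
    size_t j;

    assert(iv_subtype == IV_INDEX || iv_subtype == IV_BYTE_OFFSET);
    for (j = 0; j < objcount; j++) {
        size_t off;

        if (iv[j] < 0)
            return -1;
        /* Subtype 0 scales the index by objsize; subtype 1 is already in bytes. */
        off = (iv_subtype == IV_INDEX) ? (size_t) iv[j] * objsize
                                       : (size_t) iv[j];
        if (off > size || objsize > size - off)
            return -1;
        memcpy((char *) alias + j * objsize, (const char *) base + off, objsize);
    }
    return 0;
}
```

With a four-element float array and the index vector {3, 1}, the model fills the alias buffer with
the fourth and second elements, which is the view an application would expect from the alias array
after filling in and flushing the indirection vector.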

The return values of this function are the virtual addresses of a new alias array and a new indirection
vector. salign specifies the expected alignment of the new alias array's shadow region, and valign
specifies the expected alignment of the new alias array's virtual region. Together they determine
where the alias array will be mapped in the caches. Both must be page-aligned.

[1] Note that only subtype 0 is currently fully supported by the simulator. Other subtypes will be
supported later if found necessary.

ams_mapshadow() returns 0 on success, and -1 otherwise. If successful, the virtual address of the new
alias array is placed into the memory location pointed to by newaddr, and the virtual address of the
new indirection vector is placed into the memory location pointed to by iv_newaddr.

ERRORS: 

EINVAL  vaddr, salign, or valign is not page-aligned.

EFAULT  newaddr  or  iv_newaddr  specifies  an  invalid  address. 
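
The alignment condition behind EINVAL can be checked by the caller before issuing the system call.
The following is a minimal sketch under stated assumptions: MODEL_PAGE_SIZE (4096 bytes) and the
function check_vindirect3_alignment() are hypothetical and do not come from the Impulse headers; a
real program should use the page size of its target system.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_PAGE_SIZE 4096UL   /* assumed page size; not taken from the manual */

/* Nonzero if x is a multiple of the (assumed) page size. */
static int is_page_aligned(unsigned long x)
{
    /* The mask test below is only valid for a power-of-two page size. */
    assert((MODEL_PAGE_SIZE & (MODEL_PAGE_SIZE - 1)) == 0);
    return (x & (MODEL_PAGE_SIZE - 1)) == 0;
}

/*
 * Mirror the documented EINVAL condition: vaddr, salign, and valign must
 * all be page-aligned. Returns 0 if acceptable, -1 with errno set otherwise.
 */
int check_vindirect3_alignment(unsigned long vaddr,
                               unsigned long salign,
                               unsigned long valign)
{
    if (!is_page_aligned(vaddr) ||
        !is_page_aligned(salign) ||
        !is_page_aligned(valign)) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

Note that 0 counts as page-aligned under this check, which matches the salign and valign values
used in Figure 9.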

Figure 9 shows a simple example of using ams_mapshadow(AMS_TYPE_VINDIRECT3, ...). The
example contains both the Impulse version and the non-Impulse version. The Impulse version is
compiled when IMPULSE is defined; the non-Impulse version is compiled otherwise.


2.2  ams_remapshadow() 

int
ams_remapshadow(int   type,
                void *datablock,
                int   flags);

ams_remapshadow() allows user applications to adjust the parameters of existing remappings. The
argument type specifies the remapping type. The argument datablock should point to the structure
associated with type. The newaddr field of datablock must be a virtual address returned by a previous
call to ams_mapshadow(); it allows the kernel to find the previous settings of the remapping. The
argument flags specifies what kind of change to make.

Currently, only a very limited set of parameters can be reset; more will be allowed should the need
arise. At present, type can be one of the following: AMS_TYPE_BASESTRIDE or
AMS_TYPE_VINDIRECT3. For AMS_TYPE_BASESTRIDE, flags can be one of the following:


•  AMS_REMAP_PURGE: purges the associated shadow region data out of CPU caches;

•  AMS_REMAP_FLUSH: flushes the associated shadow region data back to main memory;

•  AMS_REMAP_STRIDE: resets the associated remapping with the new stride value in datablock;

•  AMS_REMAP_OFFSET: resets the associated remapping with the new offset value in datablock.


For AMS_TYPE_VINDIRECT3, flags can be one of the following:


•  AMS_REMAP_VADDR: resets the starting address of the original virtual region.

/*
 * Optimize a random access loop using AMS_TYPE_VINDIRECT3.
 * Basic idea: in each iteration, precompute 32 addresses then
 * access 32 data items.
 */

#define OBJCOUNT 32

float foo(float *array, int size, int itcount)
{
    vindirect3_args_t args;
    float *alias_array, sum;
    int *idx_vector, i, j;

#ifdef IMPULSE
    args.newaddr = (vaddr_t *) &alias_array;
    args.iv_newaddr = (vaddr_t *) &idx_vector;
    args.vaddr = (vaddr_t) array;
    args.size = sizeof(float) * size;
    args.objsize = sizeof(float);
    args.objcount = OBJCOUNT;
    args.iv_objsize = sizeof(int);
    args.iv_subtype = 0;    /* indirection vector holds array indices */
    args.prefcount = 0; args.prefinfo = 0;
    args.salign = 0; args.valign = 0;

    if (ams_mapshadow(AMS_TYPE_VINDIRECT3, &args) == -1) {
        printf("ams_mapshadow(AMS_TYPE_VINDIRECT3, ...) failed.\n");
        exit(1);
    }

    /* Processes itcount / OBJCOUNT full batches; any remainder is not accessed. */
    for (sum = 0, i = 0; i < itcount / OBJCOUNT; i++) {
        /* Precompute addresses */
        for (j = 0; j < OBJCOUNT; j++)
            idx_vector[j] = random() % size;
        /* Flush the indirection vector back to memory */
        flush_cacheline(0, (vaddr_t) &(idx_vector[0]));

        /* Access data */
        for (j = 0; j < OBJCOUNT; j++)
            sum += alias_array[j];
        /* Purge the alias data from the cache before the next batch */
        purge_cacheline(0, (vaddr_t) &(alias_array[0]));
    }
#else /* Non-Impulse version */
    for (sum = 0, i = 0; i < itcount; i++)
        sum += array[random() % size];
#endif

    return sum;
}

Figure 9: A code fragment using ams_mapshadow(AMS_TYPE_VINDIRECT3, ...).
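
The batching pattern of Figure 9 can be tried out without Impulse hardware by letting an ordinary
buffer stand in for the alias array. The sketch below is an emulation only: batched_sum(), BATCH,
and the explicit index array are hypothetical, and the copy loop plays the role that the memory
system plays on Impulse. Unlike the Impulse branch of Figure 9, which processes only
itcount / OBJCOUNT full batches, this version also handles a final short batch.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 32   /* plays the role of OBJCOUNT in Figure 9 */

/*
 * Sum array[idx[0]] ... array[idx[n-1]] in batches of BATCH, emulating the
 * alias array with a local buffer. A final short batch covers n % BATCH.
 */
float batched_sum(const float *array, size_t size, const int *idx, size_t n)
{
    float alias[BATCH];   /* stands in for the Impulse alias array */
    float sum = 0;
    size_t done = 0, j;

    while (done < n) {
        size_t count = (n - done < BATCH) ? n - done : BATCH;

        /* "Gather": performed by the memory system on a real Impulse machine. */
        for (j = 0; j < count; j++) {
            assert(idx[done + j] >= 0 && (size_t) idx[done + j] < size);
            alias[j] = array[idx[done + j]];
        }
        /* Access the gathered data sequentially. */
        for (j = 0; j < count; j++)
            sum += alias[j];
        done += count;
    }
    return sum;
}
```

Because the elements are added in the same order as in a direct loop over idx, the result matches
the non-Impulse version of the loop for the same index sequence.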


/*
 * Change the "offset" or "stride" of a strided remapping identified by
 * "vsaddr" which was returned by a previous call to ams_mapshadow().
 */
void foo(double *vsaddr, int value, int flags)
{
    basestride_args_t bs_args;

    bs_args.newaddr = (unsigned *) &vsaddr;

    if (flags & AMS_REMAP_OFFSET)
        bs_args.offset = value;
    if (flags & AMS_REMAP_STRIDE)
        bs_args.stride = value;

    if (ams_remapshadow(AMS_TYPE_BASESTRIDE, &bs_args, flags) != 0) {
        perror("ams_remapshadow failed");
        exit(1);
    }
}


Figure 10: Code fragment illustrating ams_remapshadow() usage.


ams_remapshadow() returns 0 on success, and -1 otherwise.

ERRORS:

EINVAL  newaddr does not specify a valid shadow range.

Figure 10 shows a simple code fragment using ams_remapshadow().
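
The pairing of remapping types and flags described in this section can be summarized as a small
validity check. This is a sketch under stated assumptions, not the kernel's logic: all MODEL_*
constants and their numeric values are hypothetical stand-ins for the real AMS_* definitions, and
the check treats flags as a bit mask (as the tests in Figure 10 suggest), so it also accepts
combinations of flags that are each valid for the given type.

```c
#include <assert.h>

/* Hypothetical values; the real constants are defined by the Impulse headers. */
enum { MODEL_TYPE_BASESTRIDE = 1, MODEL_TYPE_VINDIRECT3 = 2 };
enum {
    MODEL_REMAP_PURGE  = 1 << 0,
    MODEL_REMAP_FLUSH  = 1 << 1,
    MODEL_REMAP_STRIDE = 1 << 2,
    MODEL_REMAP_OFFSET = 1 << 3,
    MODEL_REMAP_VADDR  = 1 << 4
};

/* Flags that the text lists as adjustable for each remapping type. */
static int allowed_flags(int type)
{
    switch (type) {
    case MODEL_TYPE_BASESTRIDE:
        return MODEL_REMAP_PURGE | MODEL_REMAP_FLUSH |
               MODEL_REMAP_STRIDE | MODEL_REMAP_OFFSET;
    case MODEL_TYPE_VINDIRECT3:
        return MODEL_REMAP_VADDR;
    default:
        return 0;   /* no other type can be remapped at present */
    }
}

/* Nonzero if flags is non-empty and names only changes listed for type. */
int remap_request_valid(int type, int flags)
{
    int allowed = allowed_flags(type);

    assert(flags >= 0);   /* the model uses only the low five bits */
    return flags != 0 && (flags & ~allowed) == 0;
}
```

For example, a stride change is accepted for a strided remapping but rejected for a
VINDIRECT3 remapping, for which only the starting virtual address can be reset.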


2.3  ams_allocvirt() 

unsigned long
ams_allocvirt(int size,
              int alignment);

User applications can use this system call and ams_mapvtov() (described in Section 2.4) to optimize
their shadow data layout in virtually indexed caches. ams_allocvirt() allocates a new virtual region
which will be used by ams_mapvtov(). The new virtual region is not mapped to any physical addresses,
so it cannot be used until it has been mapped to a region of shadow addresses by ams_mapvtov().

ams_allocvirt() returns the base of the allocated virtual region on success, and -1 otherwise.

ERRORS:

EINVAL  Either size or alignment is not page-aligned.


2.4  ams_mapvtov() 


int
ams_mapvtov(unsigned srcvaddr,
            unsigned dstvaddr,
            int      size);

User applications use ams_allocvirt() (described in Section 2.3) and this system call to optimize
shadow data layout in virtually indexed caches (such as most L1 caches in modern microarchitectures).
ams_mapvtov() remaps a region of virtual addresses starting at srcvaddr to the region of physical
addresses originally mapped through the virtual region at dstvaddr. The size (in bytes) of the
remapped region is given by size. srcvaddr, dstvaddr, and size must all be page-aligned values. The
range of virtual address space specified by dstvaddr and size must map to a shadow address region
previously allocated through a call to ams_mapshadow().
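
To see why the choice of srcvaddr matters, it helps to compute which part of a virtually indexed
cache a given virtual address falls into. The following sketch assumes a 32 KB direct-mapped,
virtually indexed L1 cache; MODEL_L1_SIZE, cache_quadrant(), and region_in_quadrant() are
hypothetical names used only for illustration. It shows that three regions placed back to back
from a base aligned to the cache size occupy three distinct quadrants, which is the layout that
Figure 11 sets up.

```c
#include <assert.h>

#define MODEL_L1_SIZE 32768UL   /* assumed 32 KB direct-mapped, virtually indexed L1 */

/* Quadrant (0..3) of the cache that a virtual address indexes into. */
unsigned cache_quadrant(unsigned long vaddr)
{
    return (unsigned) ((vaddr % MODEL_L1_SIZE) / (MODEL_L1_SIZE / 4));
}

/* Nonzero if every byte of [base, base + len) indexes into quadrant q. */
int region_in_quadrant(unsigned long base, unsigned long len, unsigned q)
{
    assert(len > 0);
    return len <= MODEL_L1_SIZE / 4 &&
           cache_quadrant(base) == q &&
           cache_quadrant(base + len - 1) == q;
}
```

Two arrays at unrelated virtual addresses may index into the same quadrant and evict each other;
remapping them to consecutive quarters of a cache-size-aligned region removes that conflict in this
model.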

Note that this call offers little protection against misuse: the kernel will allow any region of the
process's virtual address space to be remapped to any region of shadow address space previously
allocated by the process.

ams_mapvtov  ( )  returns  0  on  success,  and  -1  otherwise. 

ERRORS: 

EINVAL  Either  srcvaddr  or  dstvaddr  is  not  page-aligned. 

EINVAL  size is not page-aligned.

EINVAL  dstvaddr  and  size  do  not  specify  a  valid  shadow  range. 

Figure 11 shows a simple example of using ams_allocvirt() and ams_mapvtov().

/*
 * This function maps A, B, and C to the first, second, and third
 * quadrant of L1 cache respectively.
 * Assume A, B, and C have the same size as one-fourth of L1 cache,
 * and each of them maps to a shadow region.
 */

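void foo(void *A, void *B, void *C, int size)
{
    unsigned long Av;

    /* Allocate a virtual region aligned to the L1 cache size. */
    if ((Av = ams_allocvirt(3 * size, L1_CACHE_SIZE)) == (unsigned long) -1) {
        perror("ams_allocvirt() failed");
        exit(1);
    }

    /* Map consecutive quarters of the new region onto A, B, and C. */
    if ((ams_mapvtov(Av, (unsigned long) A, size) == -1) ||
        (ams_mapvtov(Av + size, (unsigned long) B, size) == -1) ||
        (ams_mapvtov(Av + 2 * size, (unsigned long) C, size) == -1)) {
        perror("ams_mapvtov() failed");
        exit(1);
    }
}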


Figure 11: Using ams_allocvirt() and ams_mapvtov() to optimize data layout in virtually indexed
L1 cache.